We present a class of channel models exhibiting varying burst error
severity, much like channels encountered in practice. We make an
information-theoretic analysis of these channel models and draw some
conclusions that may aid in the design of coded communication systems
for realistic noisy channels.


I. Introduction 

Most of the published research in information theory deals with
memoryless channels, whereas most naturally occurring communication
channels exhibit at least some degree of burstiness, in many cases
caused by radio frequency interference (RFI). For example, optical
communication with direct detection of photons (Ref. 1),
spread-spectrum communication in the presence of hostile jamming
(Ref. 2), and communication in the presence of friendly radar
transmission (Ref. 3) all lead to channel models in which there are
periodic bursts of poor data quality. In this article we shall attempt
to model these complicated channels with a class of channels we call
“RFI channels.” The basic idea behind these models, which we will
develop in later sections, is that the channel noise severity is
required to remain constant over blocks of $n$ transmitted symbols.
However, the channel noise severity may change between one block of
$n$ symbols and the next.

Although much further work in this area remains to be done, we are
able to draw certain conclusions from this class of models that may
prove useful in practical situations. Informally, our main conclusion
is that the memory length $n$ should be exploited to determine the
noise severity within each block (this is a kind of “soft decision”
information); once the noise severity has been estimated, the best
strategy is to use $n$-fold coded interleaving to combat the noise.

II. The Channel Models 

Consider the following model for a discrete channel $f$ with memory.
We start with a finite collection of discrete memoryless channels
$f_1, f_2, \ldots, f_K$, all with the same input alphabet $A$ and
output alphabet $B$. When a sequence of letters $x_1, x_2, \ldots$
from $A$ is to be transmitted over $f$, each block of $n$ consecutive
letters is sent over one of the auxiliary channels $f_k$, which is
selected by an external random variable $Z$ taking values in the set
$\{1, 2, \ldots, K\}$. If, for example, the $f_k$'s are all binary
symmetric channels with differing raw bit-error probabilities, the
overall channel $f$ will be characterized by phased bursts of errors
of varying severity.

We consider also another channel $\bar{f}$. This channel is exactly
the same as $f$ except that it provides to the receiver the index $k$
of the discrete memoryless channel selected by $Z$.
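
The two channel models can be sketched in a few lines of Python for the binary symmetric case mentioned above. This is only an illustrative simulation, not part of the analysis: the function name `send_block`, the 0-based channel index, and the assumption that an independent value of $Z$ is drawn for every block are ours.

```python
import random

def send_block(x, crossover_probs, alphas, rng):
    """Send one block of n bits over the block-interference channel.

    crossover_probs[k] is the bit-error probability of auxiliary BSC k,
    alphas[k] is Pr{Z = k} (0-based index, an assumption of this sketch).
    Returns (y, k). A receiver without side information sees only y;
    a receiver with side information sees the pair (y, k).
    """
    # Z selects a single auxiliary channel for the whole block.
    k = rng.choices(range(len(alphas)), weights=alphas)[0]
    p = crossover_probs[k]
    # Every bit of the block passes through the same BSC.
    y = [bit ^ (rng.random() < p) for bit in x]
    return y, k

if __name__ == "__main__":
    rng = random.Random(0)
    block = [rng.getrandbits(1) for _ in range(16)]
    received, index = send_block(block, [0.0, 0.5], [0.9, 0.1], rng)
    print(index, sum(a != b for a, b in zip(block, received)))
```

Discarding the returned index models the channel without side information; keeping it models the channel that reveals $k$ to the receiver.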

Our main results are these. First, the capacity of $\bar{f}$, the
channel that reveals the index $k$ to the receiver, is independent of
$n$, the burst length. We denote this capacity by $\bar{C}$. Second,
the capacity of $f$ does depend on $n$, never exceeds $\bar{C}$, and
if we denote the capacity of $f$ by $C_n$ we have
$\lim_{n \to \infty} C_n = \bar{C}$.

Our results follow fairly easily from calculations with mutual
information and entropy. Both channels $f$ and $\bar{f}$ can be viewed
as discrete memoryless channels with input alphabet $A^n$. For $f$,
the output alphabet is $B^n$, and for $\bar{f}$, the output alphabet
is $B^n \times \{1, 2, \ldots, K\}$. The transition probabilities for
$f$ are

\[ p(\mathbf{y} \mid \mathbf{x}) = \sum_{k=1}^{K} \alpha_k \prod_{i=1}^{n} p_k(y_i \mid x_i), \]

where $\mathbf{y} = (y_1, \ldots, y_n)$, $\mathbf{x} = (x_1, \ldots, x_n)$,
$p_k(y \mid x)$ is the transition probability for $f_k$, and $\alpha_k$
is the probability that the channel selected is $f_k$:
$\alpha_k = \Pr\{Z = k\}$. For $\bar{f}$, the transition probabilities
are

\[ p(\mathbf{y}, k \mid \mathbf{x}) = \alpha_k \prod_{i=1}^{n} p_k(y_i \mid x_i). \]


From this memoryless viewpoint, the calculation of the channel
capacities is simply a matter of maximizing the appropriate mutual
informations. For $f$, the capacity is

\[ C_n = \frac{1}{n} \max_{\mathbf{X}} I(\mathbf{X}; \mathbf{Y}), \tag{1} \]

where $\mathbf{X}$ and $\mathbf{Y}$ denote the ($n$-component) random
inputs to and outputs from $f$. For $\bar{f}$, the channel that also
delivers $Z$ to the receiver, the formula is

\[ \bar{C}_n = \frac{1}{n} \max_{\mathbf{X}} I(\mathbf{X}; \mathbf{Y}, Z). \tag{2} \]

(We have indicated a dependence on $n$, but as indicated above the
capacity $\bar{C}_n$ turns out to be independent of the burst length.)

We shall consider $\bar{C}_n$ first, since its calculation is the
easier of the two. We have, using standard results about mutual
information (Ref. 4),

\[ I(\mathbf{X}; \mathbf{Y}, Z) = \sum_{k=1}^{K} \alpha_k I(\mathbf{X}; \mathbf{Y}^{(k)}), \]

where $\mathbf{Y}^{(k)}$ denotes the output of the channel $f_k$, if
$\mathbf{X}$ is the input. Since each $f_k$ is memoryless, we have

\[ I(\mathbf{X}; \mathbf{Y}, Z) \le \sum_{i=1}^{n} \sum_{k=1}^{K} \alpha_k I(X_i; Y_i^{(k)}), \tag{3} \]

where $\mathbf{X} = (X_1, \ldots, X_n)$ and
$\mathbf{Y}^{(k)} = (Y_1^{(k)}, \ldots, Y_n^{(k)})$, with equality when
the components of $\mathbf{X}$ are independent.

To compute $\bar{C}_n$, we are required to maximize this last
expression over all random vectors $\mathbf{X} = (X_1, X_2, \ldots, X_n)$.
The maximum of the inner sum in Eq. (3), taken over all choices of the
random variable $X_i$, is clearly independent of $i$, and so from
Eq. (2) we have

\[ \bar{C}_n = \sup_{X} \sum_{k=1}^{K} \alpha_k I(X; Y^{(k)}), \tag{4} \]

where the supremum in Eq. (4) is taken over all random variables $X$
taking values in the input alphabet $A$, and $Y^{(k)}$ is the
corresponding single-letter output of $f_k$. (If it happens that there
is a single input distribution $X$ that simultaneously achieves
channel capacity on all $K$ channels $f_k$, then

\[ \bar{C} = \sum_{k=1}^{K} \alpha_k C_k, \]

where $C_k$ is the capacity of $f_k$.) Equation (4) thus shows that
$\bar{C}_n$ is independent of $n$, and that it is in fact the capacity
of the DMC with transition probabilities $\{\alpha_k p_k(y \mid x)\}$.
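
As a numerical illustration of Eq. (4), the following sketch approximates the supremum by a grid search over the input distribution. It is restricted to binary-input channels and measures capacity in bits; the helper names (`mutual_information`, `side_info_capacity`, `bsc`) and the grid-search approach are assumptions of this sketch rather than anything prescribed by the analysis.

```python
import math

def mutual_information(q, P):
    """I(X;Y) in bits for input distribution (q, 1 - q) and a channel
    given as a 2-row transition matrix P[x][y]."""
    px = [q, 1.0 - q]
    n_out = len(P[0])
    py = [sum(px[x] * P[x][y] for x in range(2)) for y in range(n_out)]
    total = 0.0
    for x in range(2):
        for y in range(n_out):
            if px[x] > 0 and P[x][y] > 0:
                total += px[x] * P[x][y] * math.log2(P[x][y] / py[y])
    return total

def side_info_capacity(channels, alphas, steps=1000):
    """Grid-search approximation of Eq. (4):
    sup over q of sum_k alpha_k * I(X; Y^(k))."""
    return max(
        sum(a * mutual_information(i / steps, P)
            for a, P in zip(alphas, channels))
        for i in range(steps + 1)
    )

def bsc(p):
    """Transition matrix of a binary symmetric channel."""
    return [[1 - p, p], [p, 1 - p]]

if __name__ == "__main__":
    print(side_info_capacity([bsc(0.0), bsc(0.1), bsc(0.5)], [0.7, 0.2, 0.1]))
```

When all auxiliary channels are BSCs, the uniform input achieves capacity on each of them at once, so the search simply recovers the weighted sum of the individual capacities. For asymmetric auxiliary channels the maximizing input is a compromise, and the result can fall below that weighted sum.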

We turn now to the computation of $C_n$. It is an easy exercise to
show that

\[ I(\mathbf{X}; \mathbf{Y}, Z) - H(Z) \le I(\mathbf{X}; \mathbf{Y}) \le I(\mathbf{X}; \mathbf{Y}, Z). \tag{5} \]

It thus follows that for any random vector $\mathbf{X}$,

\[ \frac{1}{n} I(\mathbf{X}; \mathbf{Y}, Z) - \frac{H(Z)}{n} \le \frac{1}{n} I(\mathbf{X}; \mathbf{Y}) \le \frac{1}{n} I(\mathbf{X}; \mathbf{Y}, Z). \tag{6} \]

Since $H(Z)$ is a fixed number $\le \log K$, the left-hand inequality
in Eq. (6) shows that

\[ \liminf_{n \to \infty} C_n \ge \bar{C}_n = \bar{C}, \]

and the right-hand inequality shows that $C_n \le \bar{C}$, and

\[ \limsup_{n \to \infty} C_n \le \bar{C}. \]

Together these two inequalities show that
$\lim_{n \to \infty} C_n = \bar{C}$, as asserted. (We conjecture, but
have not been able to prove, that in fact $C_n$ is a monotonically
increasing function of $n$.)
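
For completeness, one way to carry out the exercise behind the two-sided bound on $I(\mathbf{X}; \mathbf{Y})$ is the chain rule for mutual information. The steps below are a standard argument supplied here as a sketch; they are not taken from the original derivation.

```latex
\begin{align*}
I(\mathbf{X}; \mathbf{Y}, Z)
  &= I(\mathbf{X}; \mathbf{Y}) + I(\mathbf{X}; Z \mid \mathbf{Y})
  && \text{(chain rule)} \\
0 \le I(\mathbf{X}; Z \mid \mathbf{Y})
  &\le H(Z \mid \mathbf{Y}) \le H(Z)
  && \text{(conditioning cannot increase entropy)} \\
\Longrightarrow\quad
I(\mathbf{X}; \mathbf{Y}, Z) - H(Z)
  &\le I(\mathbf{X}; \mathbf{Y})
  \le I(\mathbf{X}; \mathbf{Y}, Z).
\end{align*}
```

Dividing through by $n$ gives the per-symbol form, in which the penalty for not knowing $Z$ is at most $H(Z)/n$.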


III. An Example 

We illustrate these results with a simple example, with $K = 2$.
Channel $f_1$ is a noiseless binary symmetric channel, and channel
$f_2$ is a “useless” BSC with raw bit error probability 1/2.

[Figure: transition diagrams of channel $f_1$ and channel $f_2$]

We assume that the channel selector random variable $Z$ is described
by $\Pr\{Z = 1\} = 1 - \epsilon$, $\Pr\{Z = 2\} = \epsilon$. Thus if
$\epsilon$ is small, the overall channel $f$ is characterized by
noise-free transmission interrupted by occasional but very severe
error bursts.

The capacity of $f_1$ is $\log 2$, and the capacity of $f_2$ is 0;
both capacities are achieved by a uniform input distribution, and so
by Eq. (4) $\bar{C} = (1 - \epsilon) \log 2$. A straightforward
calculation shows that the capacities $C_n$ are given by

\[ C_n = (1 - \epsilon_n) \log 2 - \frac{1}{n} \left\{ H(\epsilon_n) + \epsilon_n \log (1 - 2^{-n}) \right\}, \]

where

\[ \epsilon_n = (1 - 2^{-n}) \epsilon, \qquad H(x) = -x \log x - (1 - x) \log (1 - x). \]

Since $\epsilon_n \to \epsilon$ as $n \to \infty$, it follows from this
that $C_n \to \bar{C}$, but of course this also follows from the
general results of Section II.
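
The convergence in this example is easy to tabulate. The sketch below evaluates the closed-form capacity expression in bits (logarithms to base 2), so $\log 2$ becomes 1; the function names are our own and the choice $\epsilon = 0.1$ is purely illustrative.

```python
import math

def binary_entropy(x):
    """H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def capacity_without_side_info(n, eps):
    """C_n in bits per symbol for the two-channel example."""
    eps_n = (1 - 2.0 ** (-n)) * eps
    penalty = binary_entropy(eps_n) + eps_n * math.log2(1 - 2.0 ** (-n))
    return (1 - eps_n) - penalty / n

def capacity_with_side_info(eps):
    """Capacity in bits per symbol when the receiver is told the index k."""
    return 1 - eps

if __name__ == "__main__":
    eps = 0.1
    for n in (1, 2, 4, 8, 16, 64, 256, 1024):
        print(n, capacity_without_side_info(n, eps), capacity_with_side_info(eps))
```

For $n = 1$ the expression reduces to $1 - H(\epsilon/2)$, the capacity of a BSC with crossover probability $\epsilon/2$, which is a useful sanity check on the formula. Printing the values for increasing $n$ also lets the reader inspect the conjectured monotonicity numerically.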

How should these results be interpreted? First, we note that the
channel $\bar{f}$ is equivalent to a channel exhibiting erasure
bursts: once it is known that channel $f_2$ was used to transmit a
block of $n$ bits, the received versions of these bits should be
ignored, since they bear no relationship to the transmitted bits. And
it is easy to verify that the capacity of such an erasure-burst
channel is indeed $(1 - \epsilon) \log 2$, whatever the burst length.

It is evident that $C_n$ ought to be less than $\bar{C}$, since the
receiver using $f$ will not know when a received block of length $n$
is bad, whereas the receiver using $\bar{f}$ will, and this extra
information cannot possibly hurt performance. But if $n$ is very
large, the $f$-users could, for example, include in the $n$ bits in
each transmitted packet a certain number of parity checks. To be
specific, let us assume that each packet includes $\log_2 n$ parity
checks. Then if the packet is transmitted over $f_1$, all of these
parity checks will be satisfied upon reception. But if the packet is
transmitted over $f_2$, these will be parity checks on random data,
and the probability that they will all be satisfied is
$2^{-\log_2 n} = n^{-1}$. Thus when $n$ is large, the presence of a
useless data packet can be detected with high probability and low
overhead. In other words, for large $n$ the channel $f$ is virtually
identical with $\bar{f}$, and this is what our computations with
mutual information predicted.
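
The detection probability quoted above can be checked empirically. The sketch below uses one simple, hypothetical parity layout (parity bit $j$ covers data bits $j, j+m, j+2m, \ldots$); the text does not specify which parity checks are used, so this layout and the function names are assumptions. It estimates how often a uniformly random packet, as produced by the useless channel, passes all $m$ checks.

```python
import random

def add_parity(data, m):
    """Append m parity bits; parity j covers data bits j, j+m, j+2m, ..."""
    parity = [sum(data[j::m]) % 2 for j in range(m)]
    return data + parity

def passes_checks(packet, m):
    """True if all m parity checks of the packet are satisfied."""
    data, parity = packet[:-m], packet[-m:]
    return all(sum(data[j::m]) % 2 == parity[j] for j in range(m))

def false_pass_rate(n, m, trials, rng):
    """Fraction of uniformly random n-bit packets that satisfy all
    m checks, i.e. bad packets that would go undetected."""
    hits = 0
    for _ in range(trials):
        packet = [rng.getrandbits(1) for _ in range(n)]
        hits += passes_checks(packet, m)
    return hits / trials

if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 64, 6  # m = log2(n) parity bits in an n-bit packet
    print(false_pass_rate(n, m, 20000, rng), 1 / n)
```

With $n = 64$ and $m = 6$ the estimate should lie close to $1/64$, at a rate cost of only 6 bits in 64.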

Thus if $n$ is sufficiently large, a good strategy for communication
over $f$ is to reserve a certain number of the bits in each
transmitted packet for parity. This number should be large enough so
that the presence of bad data can be detected with high probability,
but small enough (relative to $n$) not to substantially reduce the
transmission rate. This strategy will, as previously explained,
effectively transform the channel into an erasure-burst channel. Then
if $n'$ denotes the number of bits in each packet not reserved for
parity, one should code for the channel by interleaving $n'$ copies of
a code designed for use on the binary erasure channel (BEC). Since the
capacity of the BEC is just as large as that of the erasure-burst
channel, presumably there will be no performance loss. Furthermore,
the decoding complexity of the $n'$ parallel binary codes is much less
than $n$ times the complexity of decoding just one such code; see
Ref. 5 for details.
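
The interleaving step can be pictured as a simple block interleaver: packet $t$ carries symbol $t$ of each of the $n'$ codewords, so an erased packet removes exactly one symbol, in the same position, from every codeword. The sketch below illustrates this under the assumption that all codewords have equal length; the function names are hypothetical, and erased symbols are represented by `None`.

```python
def interleave(codewords):
    """Turn n' codewords of length L into L packets of n' symbols:
    packet t holds symbol t of every codeword."""
    return [list(column) for column in zip(*codewords)]

def deinterleave(packets):
    """Inverse of interleave: recover the n' codewords from L packets."""
    return [list(row) for row in zip(*packets)]

if __name__ == "__main__":
    codewords = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]  # n' = 3, L = 4
    packets = interleave(codewords)
    # Suppose the parity checks flag packet 1 as useless: erase it whole.
    packets[1] = [None] * len(packets[1])
    for word in deinterleave(packets):
        print(word)
```

Because every codeword sees the same erasure pattern, the erasure locations need to be processed only once for all $n'$ codewords, which is consistent with the complexity remark above.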


IV. Conclusions 

On the basis of the mutual information calculation in Section II, and
on the basis of the example in Section III, we draw the following
conclusions about RFI channels. First, to communicate reliably over
$\bar{f}$, nothing is lost by interleaving, and in addition there may
be a considerable advantage in doing so. Second, while there will in
general be a penalty in performance if interleaving on $f$ is used, if
$n$ is large enough it may be possible to accurately estimate the
channel index $k$ affecting a given data packet of length $n$ by using
some kind of generalized parity check. If this can be done, then $f$
is effectively transformed into $\bar{f}$, and then interleaving can
be used without penalty.